
Refine all matmul and binary ops #54

Merged: 1 commit merged into intel:master on Jun 22, 2020

Conversation

pinzhenx
Contributor

@pinzhenx pinzhenx commented Jun 5, 2020

Fix #9
Fix #48
Fix #50

@pinzhenx pinzhenx marked this pull request as draft June 5, 2020 06:41
@jiayisunx
Contributor

relevant UT:

    # assumes the surrounding test module's imports, e.g. torch and the
    # ipex package (exact module name assumed):
    #   import torch
    #   import intel_pytorch_extension as ipex
    def test_bmm_(self):
        ipex.core.enable_auto_dnnl()
        ipex.core.enable_mix_bf16_fp32()
        a = torch.randn(8, 12, 128, 64).to('dpcpp')
        b = torch.randn(8, 12, 128, 64).to('dpcpp')
        # batched matmul with the last two dims of b transposed
        attention_scores = torch.matmul(a, b.transpose(-1, -2))

@pinzhenx pinzhenx force-pushed the resize branch 6 times, most recently from 68fb1cf to e24f42d on June 19, 2020 03:27
@pinzhenx pinzhenx marked this pull request as ready for review June 19, 2020 09:02
@pinzhenx
Contributor Author

@EikanWang The PR is ready for review. PTAL.

@pinzhenx pinzhenx changed the title fix *mm ops and support resizing behavior Refine all matmul and binary ops Jun 19, 2020
@pinzhenx
Contributor Author

pinzhenx commented Jun 19, 2020

  • throw exceptions on broadcastable inputs for fallback
    Some matmul and binary ops support broadcasting while dnnl does not, so we check for broadcastable inputs and throw an exception to bypass these cases. Plus, we can now safely remove the workaround code in jit that temporarily disabled dnnl ops during bn folding.
    For topologies that rely heavily on broadcasting, this might cause performance issues; we should later disable some ops on a case-by-case basis or implement broadcast in dnnl. (A sketch of the check is given after this list.)

  • strengthen shape check for matmul
    Our old implementation shared some code between 2d matmul and 3d matmul, which caused some issues. Now they are completely separated. (See the second sketch after this list.)

  • add dil_size
    We query the size of a tensor to check shapes for matmul; dil_size lets this query run without falling back to the cpu path.

  • refine out=... behavior
    All ops with an out=... parameter now discard the old storage entirely and replace it with a dil tensor yielded from the dil ops, regardless of whether the underlying storage is large enough. This differs slightly from pytorch's resizing behavior, where the storage is replaced only if it is smaller than the required output size.
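As a rough illustration of the first bullet, here is a minimal Python sketch of the fallback rule (the real check lives in the C++ op implementations; the helper name is hypothetical):

    import torch

    def assert_no_broadcast(a, b):
        # the dnnl kernels only handle identical input shapes; a shape
        # mismatch implies broadcasting, so raise and let the caller
        # fall back to the native cpu path
        if a.shape != b.shape:
            raise RuntimeError("dnnl does not support broadcast")

    a = torch.randn(8, 12, 128, 64)
    b = torch.randn(1, 12, 128, 64)   # first dim would need broadcasting
    try:
        assert_no_broadcast(a, b)
    except RuntimeError:
        result = a + b                # fallback: stock cpu kernel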

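And a sketch of the separated shape checks from the second bullet, again hypothetical Python standing in for the C++ code:

    import torch

    def check_matmul_shapes(a, b):
        # 2d and 3d matmul are now validated by fully separate paths
        if a.dim() == 2 and b.dim() == 2:
            assert a.size(1) == b.size(0)   # (m, k) @ (k, n)
        elif a.dim() == 3 and b.dim() == 3:
            assert a.size(0) == b.size(0)   # batch dims must match
            assert a.size(2) == b.size(1)   # (b, m, k) @ (b, k, n)
        else:
            raise RuntimeError("unsupported rank combination")

    check_matmul_shapes(torch.randn(4, 5), torch.randn(5, 6))
    check_matmul_shapes(torch.randn(8, 4, 5), torch.randn(8, 5, 6))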
@@ -89,6 +89,15 @@ void reorder_to_desc(const at::Tensor& tensor, const dil::tensor::desc& expected
}

void equip_dil_buffer(const at::Tensor& tensor, dil::tensor dil_tensor_buffer) {
TORCH_INTERNAL_ASSERT_DEBUG_ONLY(
Contributor

Should we use TORCH_CHECK here?

// After equip_dil_buffer(), whole storage should be managed by dil tensor,
// and thus storage metadata should be overwritten by dil tensor
// Note: Storage::set_numel() might be removed later
ipex_tensor_impl->storage().set_numel(dil_tensor_buffer.get_nelems());
Contributor

Does that mean the number of elements of the dil tensor may not be the same as that of the ATen tensor (dilation/padding)?

Contributor Author

We may pass a dummy tensor with zero size as the output and let our ops replace its buffer with one of concrete sizes.
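For illustration, the pattern described above with stock pytorch tensors (the ipex 'dpcpp' device and dil buffer replacement are omitted; plain torch shows the same user-visible behavior):

    import torch

    a = torch.randn(2, 3)
    b = torch.randn(3, 4)
    out = torch.empty(0)           # zero-size dummy output tensor
    torch.matmul(a, b, out=out)    # op replaces the storage with one
    print(out.shape)               # of concrete sizes: torch.Size([2, 4])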

@pinzhenx pinzhenx marked this pull request as draft June 22, 2020 02:31
@pinzhenx pinzhenx marked this pull request as ready for review June 22, 2020 02:52
@EikanWang
Contributor

LGTM

@EikanWang EikanWang self-requested a review June 22, 2020 03:19
@EikanWang EikanWang merged commit 38f9fa7 into intel:master Jun 22, 2020
zhuhaozhe pushed a commit to zhuhaozhe/intel-extension-for-pytorch that referenced this pull request Jun 24, 2020
EikanWang pushed a commit that referenced this pull request Oct 4, 2021
* add ops into blacklist

* clean format

* remove tok from blacklist
Successfully merging this pull request may close these issues:

  • Revisit all *_out ops
  • baddbmm might output random result
  • matmul does not support broadcast
3 participants